2. Vocabulary selection

Let us now deal with another topic that is, in various ways, preliminary to lexicography: the problem of selected vocabularies, that is, "basic", "simplified" or "defining" vocabularies.

It is common knowledge that in all languages there are words that are used very frequently, the "easy words" that everybody knows, and other words that are used much less, either because they can be replaced by simpler ones (e.g. thing instead of object) or because they refer to some specialised field. More or less the same applies to morphology and syntax, where we can find common, plain, "regular" structures and patterns, as well as anomalies of various kinds and patterns that are seldom used.

This was intuitively recognised long before the development of modern linguistics and led to various proposals of "universal" or at least "international" languages, featuring essential vocabularies, totally regular grammars, and consistent one-to-one relationships between spelling and pronunciation. Suggestions towards the simplification and regularising of natural languages (i.e. languages like English or Italian) were also put forward.

These suggestions are justified by the well-established fact that the most frequent words account for a very high percentage of text. Counts vary a little from language to language, but there seems to be a high degree of convergence towards the following figures:

No. of most frequent words    Percentage of text covered
100                           60 to 64%
1000                          65 to 74%
1900                          75 to 84%
3100                          85 to 94%
14500                         95% or more
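Coverage figures of this kind can be estimated for any text by counting how many running words belong to the N most frequent word forms. What follows is only a minimal sketch in Python: the function name coverage and the toy sentence are assumptions made for illustration, and splitting on white space is a crude stand-in for proper tokenisation. Reliable figures require a large corpus.

```python
from collections import Counter


def coverage(tokens, top_n):
    """Return the share of running words covered by the top_n most frequent words."""
    if not tokens:
        raise ValueError("tokens must not be empty")
    counts = Counter(tokens)
    # Sum the frequencies of the top_n most frequent word forms.
    covered = sum(freq for _, freq in counts.most_common(top_n))
    return covered / len(tokens)


# Toy text for illustration only; real counts need a large corpus.
text = "the cat and the dog and the bird saw the cable"
tokens = text.lower().split()

for n in (1, 2, 3):
    print(n, round(coverage(tokens, n), 2))
```

In this toy sentence the two most frequent forms (the and and) already account for 6 of the 11 running words, which shows on a very small scale how quickly coverage grows with the first few items.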

Frequency, however, is inversely proportional to informativity. If a fragment of text says "... but all that can be said about this is that one cannot do without it, because it is one of those things people will always be in need of..." (all words ranking among the most frequent), we have no idea what the text is about. On the other hand, words like cable or to cancel are much more revealing, and terms like byte leave no doubt about the topic being discussed. Mastery of the appropriate vocabulary is at the basis of precise and rapid communication.

On the whole, we can conclude that vocabularies selected on the basis of some criterion can be very useful in several cases. The first and most obvious criterion is frequency, as we have already seen. The second is distribution: frequency being equal, words that occur in a variety of texts and contexts are more useful than words that tend to occur only in certain topics or types of text. A third criterion is replaceability: some words can easily be replaced with synonyms or phrases, while others are much harder to do without. These criteria will be clearer after we have examined a few selected vocabularies.
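The difference between frequency and distribution can be made concrete by counting, for each word, both its total occurrences and the number of texts in which it appears. The sketch below is an illustration under stated assumptions: the function name frequency_and_range and the three miniature "texts" are invented for the example, and replaceability is left out because it cannot be measured by simple counting.

```python
from collections import Counter


def frequency_and_range(documents):
    """Return {word: (total occurrences, number of documents containing it)}."""
    freq = Counter()
    doc_range = Counter()
    for doc in documents:
        tokens = doc.lower().split()
        freq.update(tokens)            # every occurrence counts
        doc_range.update(set(tokens))  # each document counts once per word
    return {word: (freq[word], doc_range[word]) for word in freq}


# Three invented miniature texts, for illustration only.
docs = [
    "the cable and the byte",
    "the byte and the byte",
    "the thing and the cable",
]
stats = frequency_and_range(docs)

for word in ("and", "byte"):
    print(word, stats[word])
```

Here and and byte occur three times each, but and appears in all three texts while byte appears in only two; frequency being equal, the distribution criterion would rank and higher.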

2.1 BASIC English (C. K. Ogden, 1925-1932)

Ogden's first objective had been the creation of a new language, based on English but independent of it, called British American Scientific International Commercial language; later on, BASIC was reinterpreted as BASIC English, incorporating some rules from the standard language and mostly functioning as a selected vocabulary, where the main criterion for selection was replaceability.

Here are the 850 words of BASIC English subdivided into the five categories proposed by Ogden:

Operations etc. (100)

come get give go keep let make put seem take be do have say see send may will about across after against among at before between by down from in off on over through to under up with as for of till than a the all any every no other some such that this I he you who and because but or if though while how when where why again ever far forward here near now out still then there together well almost enough even little much not only quite so very tomorrow yesterday north south east west please yes

Things (400 General)

account act addition adjustment advertisement agreement air amount amusement animal answer apparatus approval argument art attack attempt attention attraction authority back balance base behaviour belief birth bit bite blood blow body brass bread breath brother building burn burst business butter canvas care cause chalk chance change cloth coal colour comfort committee company comparison competition condition connection control cock copper copy cork cotton cough country cover crack credit crime crush cry current curve damage danger daughter day death debt decision degree design desire destruction detail development digestion direction discovery discussion disease disgust distance distribution division doubt drink driving dust earth edge education effect end error event example exchange existence expansion experience expert fact fall family father fear feeling fiction field fight fire flame flight flower fold food force form friend front fruit glass gold government grain grass grip group growth guide harbour harmony hate hearing heat help history hole hope hour humour ice idea impulse increase industry ink insect instrument insurance interest invention iron jelly join journey judge jump kick kiss knowledge land language laugh law lead learning leather letter level lift light limit linen liquid list look loss love machine man manager mark market mass meal measure meat meeting memory metal middle milk mind mine minute mist money month morning mother motion mountain move music name nation need news night noise note number observation offer oil operation opinion order organisation ornament owner page pain paint paper part paste payment peace person place plant play pleasure point poison polish porter position powder power price print process produce profit property prose protest pull punishment purpose push quality question rain range rate ray reaction reading reason record regret relation religion representative request respect rest reward rhythm rice river road roll 
room rub rule run salt sand scale science sea seat secretary selection self sense servant sex shade shake shame shock side sign silk silver sister size sky sleep slip slope smash smell smile smoke sneeze snow soap society son song sort sound soup space stage start statement steam steel step stitch stone stop story stretch structure substance sugar suggestion summer support surprise swim system talk taste tax teaching tendency test theory thing thought thunder time tin top touch trade transport trick trouble turn twist unit use value verse vessel view voice walk war wash waste water wave wax way weather week weight wind wine winter woman wood wool word work wound writing year

Things (200 Pictured)

angle ant apple arch arm army baby bag ball band basin basket bath bed bee bell berry bird blade board boat bone book boot bottle box boy brain brake branch brick bridge brush bucket bulb button cake camera card carriage cart cat chain cheese chest chin church circle clock cloud coat collar comb cord cow cup curtain cushion dog door drain drawer dress drop ear egg engine eye face farm feather finger fish flag floor fly foot fork fowl frame garden girl glove goat gun hair hammer hand hat head heart hook horn horse hospital house island jewel kettle key knee knife knot leaf leg library line lip lock map match monkey moon mouth muscle nail neck needle nerve net nose nut office orange oven parcel pen pencil picture pig pin pipe plane plate plough pocket pot potato prison pump rail rat receipt ring rod roof root sail school scissors screw seed sheep shelf ship shirt shoe skin skirt snake sock spade sponge spoon spring square stamp star station stem stick stocking stomach store street sun table tail thread throat thumb ticket toe tongue tooth town train tray tree trousers umbrella wall watch wheel whip whistle window wing wire worm

Qualities (100 General)

able acid angry automatic beautiful black boiling bright broken brown cheap chemical chief clean clear common complex conscious cut deep dependent early elastic electric equal fat fertile first fixed flat free frequent full general good great grey hanging happy hard healthy high hollow important kind like living long male married material medical military natural necessary new normal open parallel past physical political poor possible present private probable quick quiet ready red regular responsible right round same second separate serious sharp smooth sticky stiff straight strong sudden sweet tall thick tight tired true violent waiting warm wet wide wise yellow young

Qualities (50 Opposites)

awake bad bent bitter blue certain cold complete cruel dark dead dear delicate dirty different dry false feeble female foolish future green ill last late left loose loud low mixed narrow old opposite public rough sad safe secret short shut simple slow small soft solid special strange thin white wrong

Notice that operation, thing, general, picture, quality and opposite are all BASIC words, whereas noun, adjective, count, etc. are not, and that is why they have been replaced.

The following rules must be added:

1. Addition of 's' to things when there is more than one

2. Endings in 'er', 'ing', 'ed' from 300 names of things

3. 'ly' forms from qualities

4. Degree with 'more' and 'most'

5. Questions by change of order and 'do'

6. Form-changes in the names of acts, and 'that', 'this', 'I', 'he', 'you', 'who', as in normal English

7. Measures, numbers, days, months and the international words in English form.

Remarks on BASIC English

Replaceability, rather than frequency, was mentioned above as the main criterion in BASIC vocabulary selection. This explains why seem is included among the "Operations", which contain only 16 full verbs plus the auxiliaries may and will. Its frequency is not very high, but it is difficult to replace with other words in the list without recourse to complex expressions. For the same reason (the need to express uncertainty, probability and likelihood) may is in the list, while can is not, since in most cases it can be replaced with be able or know how.

On the basis of the second rule, another 300 verbs can be obtained by conversion from words in the lists of "things", i.e. nouns. The interpretation of the rule has been the subject of sharp disagreement among the advocates of BASIC, as it was not clear whether it should be applied without exceptions or spelling variants (thus obtaining such forms as *bited, *copyer or *completeing, unacceptable in standard English) or whether standard rules should apply as well. A similar problem concerns the first rule, which, applied mechanically, would produce such forms as *babys and *foots. Clearly, these choices mark the boundary between BASIC as an autonomous, perfectly regular language, and BASIC English. The latter solution has prevailed (for example, in the rewriting of books in BASIC English) and it is also confirmed by rule no. 6.

According to the fourth rule, thinner or worst are not BASIC English; more thin and most bad are.

The last rule brings the number of words well above one thousand by adding measures, numbers, days, months and the international words.

The list of the 850 BASIC words presents curious aspects and some choices that are difficult to explain. The unusual subdivision between "general" and "pictured" things does not coincide with the distinction between abstract and concrete nouns. One may well wonder why bread, fruit, glass and paper belong to the group of "general" things although they are clearly picturable; perhaps because they are mass (or non-countable) nouns.

Adjectives are subdivided between "general" and opposites and here, too, we find a lot of strange things:

- among colours, black is 'general' and white is 'opposite', which is not surprising, since the fixed English phrase (Italian 'bianco e nero') is black and white; there is no clue, however, as to why brown, grey, red and yellow are 'general' while blue and green are 'opposites'. The only hypothesis is that here, as elsewhere, semantic considerations have been abandoned in favour of round figures;

- past and present are 'general', future is 'opposite';

- wet is 'general', dry is 'opposite';

- cheap is 'general', dear is 'opposite'; and so on.

The list includes 13 ing-forms:

- building, driving, feeling, hearing, learning, meeting, reading, teaching and writing are classed among 'general things';

- boiling, hanging, living and waiting among 'general qualities'.

None of the respective base forms is found in BASIC English: instead of to write, one has to use to do a/some writing. The attempt to minimise the number of verbs is one of the most evident and most questionable aspects of BASIC English.

Besides, the list now shows its age: we find cart and carriage but not motor, car, traffic or accident. Cough and sneeze were worrying symptoms before antibiotics were developed; now they would probably be replaced by cancer, infarction and perhaps AIDS. In other fields, education and teaching are BASIC but medicine and engineering are not; brass, coal, copper, iron and steel are included but aluminium, plastic and uranium are not; neither is petroleum which, however, can be replaced by oil.

Several more examples could be found for nearly all the semantic fields, as the project was developed between 1925 and 1932 and enormous changes and developments have taken place in the meantime.

In spite of all that, BASIC English has had widespread recognition and has been used to write or "translate" important texts, such as The BASIC Bible, 1950. The project is one of the numerous attempts at creating a language that is regular, simple to learn and to use, yet fit for all communicative purposes: attempts that are partly missionary and partly Utopian. Its use here has been to help us define some key concepts in lexicology, starting from a real example rather than from abstract definitions.

The complete alphabetical list of BASIC English follows; it will be useful later on for comparisons with other lists:

a able about account acid across act addition adjustment advertisement after again against agreement air all almost among amount amusement and angle angry animal answer ant any apparatus apple approval arch argument arm army art as at attack attempt attention attraction authority automatic awake

baby back bad bag balance ball band base basin basket bath be beautiful because bed bee before behaviour belief bell bent berry between bird birth bit bite bitter black blade blood blow blue board boat body boiling bone book boot bottle box boy brain brake branch brass bread breath brick bridge bright broken brother brown brush bucket building bulb burn burst business but butter button by

cake camera canvas card care carriage cart cat cause certain chain chalk chance change cheap cheese chemical chest chief chin church circle clean clear clock cloth cloud coal coat cock cold collar colour comb come comfort committee common company comparison competition complete complex condition connection conscious control copper copy cord cork cotton cough country cover cow crack credit crime cruel crush cry cup current curtain curve cushion cut

damage danger dark daughter day dead dear death debt decision deep degree delicate dependent design desire destruction detail development different digestion direction dirty discovery discussion disease disgust distance distribution division do dog door doubt down drain drawer dress drink driving drop dry dust

ear early earth east edge education effect egg elastic electric end engine enough equal error even event ever every example exchange existence expansion experience expert eye

face fact fall false family far farm fat father fear feather feeble feeling female fertile fiction field fight finger fire first fish fixed flag flame flat flight floor flower fly fold food foolish foot for force fork form forward fowl frame free frequent friend from front fruit full future

garden general get girl give glass glove go goat gold good government grain grass great green grey grip group growth guide gun

hair hammer hand hanging happy harbour hard harmony hat hate have he head healthy hearing heart heat help here high history hole hollow hook hope horn horse hospital hour house how humour

I ice idea if ill important impulse in increase industry ink insect instrument insurance interest invention iron island

jelly jewel join journey judge jump

keep kettle key kick kind kiss knee knife knot knowledge

land language last late laugh law lead leaf learning leather left leg let letter level library lift light like limit line linen lip liquid list little living lock long look loose loss loud love low

machine make male man manager map mark market married mass match material may meal measure meat medical meeting memory metal middle military milk mind mine minute mist mixed money monkey month moon morning mother motion mountain mouth move much muscle music

nail name narrow nation natural near necessary neck need needle nerve net new news night no noise normal north nose not note now number nut

observation of off offer office oil old on only open operation opinion opposite or orange order organisation ornament other out oven over owner

page pain paint paper parallel parcel part past paste payment peace pen pencil person physical picture pig pin pipe place plane plant plate play please pleasure plough pocket point poison polish political poor porter position possible pot potato powder power present price print prison private probable process produce profit property prose protest public pull pump punishment purpose push put

quality question quick quiet quite

rail rain range rat rate ray reaction reading ready reason receipt record red regret regular relation religion representative request respect responsible rest reward rhythm rice right ring river road rod roll roof room root rough round rub rule run

sad safe sail salt same sand say scale school science scissors screw sea seat second secret secretary see seed seem selection self send sense separate serious servant sex shade shake shame sharp sheep shelf ship shirt shock shoe short shut side sign silk silver simple sister size skin skirt sky sleep slip slope slow small smash smell smile smoke smooth snake sneeze snow so soap society sock soft solid some son song sort sound soup south space spade special sponge spoon spring square stage stamp star start statement station steam steel stem step stick sticky stiff still stitch stocking stomach stone stop store story straight strange street stretch strong structure substance such sudden sugar suggestion summer sun support surprise sweet swim system

table tail take talk tall taste tax teaching tendency test than that the then theory there thick thin thing this though thought thread throat through thumb thunder ticket tight till time tin tired to toe together tomorrow tongue tooth top touch town trade train transport tray tree trick trouble trousers true turn twist

umbrella under unit up use

value verse very vessel view violent voice

waiting walk wall war warm wash waste watch water wave wax way weather week weight well west wet wheel when where while whip whistle white who why wide will wind window wine wing winter wire wise with woman wood wool word work worm wound writing wrong

year yellow yes yesterday you young
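An alphabetical list like the one above can be obtained mechanically from the five categories, and the same representation makes comparisons with other lists straightforward. The following Python sketch assumes only a few sample words per category (not the full 850) and a purely hypothetical second list called other_list; the function names are likewise invented for the example.

```python
def alphabetical(categories):
    """Merge the category lists and sort them, ignoring case so that 'I' sorts under i."""
    words = {word for group in categories.values() for word in group}
    return sorted(words, key=str.lower)


def shared_words(list_a, list_b):
    """Return the words that two vocabularies have in common, in alphabetical order."""
    return sorted(set(list_a) & set(list_b), key=str.lower)


# A few sample words from each of Ogden's five categories (not the full lists).
categories = {
    "operations": ["come", "get", "I", "yes"],
    "things_general": ["account", "year"],
    "things_pictured": ["angle", "worm"],
    "qualities_general": ["able", "young"],
    "qualities_opposites": ["awake", "wrong"],
}

basic_sample = alphabetical(categories)
print(basic_sample)

# Hypothetical second vocabulary, invented for the comparison.
other_list = ["able", "car", "get", "traffic", "year"]
print(shared_words(basic_sample, other_list))
```

Sorting with a case-insensitive key is a small design choice: a plain sort would place the capitalised pronoun I before all lower-case words instead of among the entries beginning with i.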
